Fit a Generalized Linear Model for Peak Offset
if exist('Gr','var')==0 % If table not in workspace, load it.
Gr = getfield(load(defaults.files('single_channel_stats_mlx_table')),'Gr');
We would like to fit a model that predicts the Peak Offset parameter that we extracted in analyze.stat.fit_gauspuls. We are primarily interested in the fixed effect of:
- GroupID : Is there an effect of whether the rat has an ischemia or not?
- PostOpDay : Is there a trend that correlates with recovery over time? Rats performed the task daily between post-op days 3 to 28.
- Area : Either "RFA" or "CFA"
We also think that the following random effects could have an impact on our model intercept:
- ChannelID : We expect some channels intrinsically to have an earlier or later peak relative to others.
- AnimalID : We expect some animals to have a different aggregate distribution of channels.
Finally, at the Channel level, we also think there could be an influence of coefficient for the following terms:
- Error_SS : The residual sum-of-squares from individual gauspuls trial fits.
- Duration : The total time that it took for the reach-to-grasp behavior to finish execution.
Fit model for PeakOffset
We can fit a generalized linear mixed-effects model (GLME) using the Matlab fitglme function that requires the 'Statistics and Machine Learning Toolbox', version 12.0. This function takes a model specified in Wilkinson Notation, which defines Fixed and Random effects. We will fit using the ApproximateLaplace method, which allows us to compare model likelihoods if needed (as opposed to a pseudolikelihood-based method). glme_Offset = fitglme(Gr,"PeakOffset~GroupID*PostOpDay*Area+(1+Error_SS+Duration|ChannelID)+(1|AnimalID)", ...
'FitMethod','ApproximateLaplace');
disp(glme_Offset);
Generalized linear mixed-effects model fit by ML
Model information:
Number of observations 20221
Fixed effects coefficients 8
Random effects coefficients 865
Covariance parameters 8
Distribution Normal
Link Identity
FitMethod ApproximateLaplace
Formula:
Linear Mixed Formula with 7 predictors.
Model fit statistics:
AIC BIC LogLikelihood Deviance
27147 27274 -13557 27115
Fixed effects coefficients (95% CIs):
Name Estimate SE tStat DF pValue Lower
{'(Intercept)' } -0.15056 0.013293 -11.327 20213 1.19e-29 -0.17662
{'GroupID_Intact' } -0.0085388 0.029703 -0.28747 20213 0.77376 -0.06676
{'Area_CFA' } 0.061321 0.018185 3.3721 20213 0.00074732 0.025677
{'PostOpDay' } 0.0022763 0.00081302 2.7998 20213 0.0051181 0.00068271
{'GroupID_Intact:Area_CFA' } 0.03572 0.040468 0.88267 20213 0.37743 -0.043601
{'GroupID_Intact:PostOpDay' } 0.00042269 0.0019799 0.21349 20213 0.83094 -0.003458
{'Area_CFA:PostOpDay' } -0.0023651 0.0011435 -2.0683 20213 0.038628 -0.0046066
{'GroupID_Intact:Area_CFA:PostOpDay'} -0.00012018 0.0027485 -0.043724 20213 0.96512 -0.0055075
Upper
-0.12451
0.049682
0.096964
0.0038699
0.11504
0.0043034
-0.00012371
0.0052672
Random effects covariance parameters:
Group: ChannelID (285 Levels)
Name1 Name2 Type Estimate
{'(Intercept)'} {'(Intercept)'} {'std' } 0.050767
{'Duration' } {'(Intercept)'} {'corr'} -0.64032
{'Error_SS' } {'(Intercept)'} {'corr'} -0.92757
{'Duration' } {'Duration' } {'std' } 0.012323
{'Error_SS' } {'Duration' } {'corr'} 0.88095
{'Error_SS' } {'Error_SS' } {'std' } 0.029449
Group: AnimalID (10 Levels)
Name1 Name2 Type Estimate
{'(Intercept)'} {'(Intercept)'} {'std'} 0.0078282
Group: Error
Name Estimate
{'sqrt(Dispersion)'} 0.46985
Model Fixed-Effect Coefficients
All probabilities are computed using two-tailed t-tests for a distribution with an estimated degrees of freedom of 20,213.
- (Intercept) : The term with the greatest effect on the model is the average offset value. On average, peak times precede the Grasp by approximately 150-ms (coefficient estimate of -0.15056 seconds). This effect is statistically significant, with tStat=-11.327 and p=1.19x10^-29. For other coefficients in this list, the tStat value is omitted, but you can check the table above in the corresponding column.
- GroupID[Intact] : All "dummy" variables that are created for fixed-effects involving a categorical variable have n-1 coefficients estimated (since they use the "reference" method). In this case, Ischemia is set to zero (the reference) and the coefficient for Intact is estimated. The coefficient estimate for the effect is very small, and the p-value (p=0.77) indicates that this is not a significant model term.
- Area[CFA] : As with GroupID, Area only has the coefficient for one level computed- CFA (the intact hemisphere primary motor cortex homolog, which is ipsilateral to the reaching forelimb). However, unlike GroupID, there is a significant effect for this main effect coefficient (p=0.00075), which indicates that channels in CFA have a peak offset that on average is approximately 60-ms (0.0613 seconds) closer to the Grasp than those in RFA. This makes sense: we expect RFA, a putative premotor homolog, to in general have peak-excitation earlier in time relative to the grasp behavior.
- PostOpDay : This is a continuous fixed-effect, and the model indicates it is also a significant term (p=0.0051). The coefficient value of 0.0022 (i.e. 2-ms per day) seems a little bit low, but over the course of 30 days that can be a 60-ms shift, which is substantial.
- GroupID[Intact]*Area[CFA], GroupID[Intact]*PostOpDay : Neither term tests statistically significant (p=0.38, and p=0.83 respectively). These terms indicate that the change in timing between CFA-RFA channels, as well as the slope of expected peak offset by day, does not change based on whether the rat has an ischemic infarct or not.
- Area[CFA]*PostOpDay : This interaction term tests significant at alpha=0.05 level, with a negative coefficient indicating that for CFA channels, there is a decline of approximately 2-ms per day that seems to match the expected overall fixed-effect of PostOpDay. This indicates that we expect the peak offset time to stay relatively constant in CFA, whereas in RFA (for either group), we expect the value to shift closer to the Grasp.
- Area[CFA]*GroupID[Intact]*PostOpDay : The three-way interaction is not significant, meaning there is no deviation in slope of expected Peak Offset by post-op day that is specific to one combination of Group and Area.
Check model fit based on explained variance
disp(glme_Offset.Rsquared);
Ordinary: 0.0186
Adjusted: 0.0182
One concern after checking the Rsquared values is that our model doesn't really capture a lot of the data.
Check model fit using scatter plots
analyze.stat.scatter_var(Gr,'PeakOffset_ms','AddLabel',false,'YLab',"Peak-Offset");
We can see from the scatter that the reason our model fit is poor is because the range of peak times relative to behavior on any given trial is quite large relative to any overall trends in average peak times either by group, area, or post-op day. In general, there seems to be a robust difference between RFA and CFA peaks, with RFA peaks tending to be earlier in general than CFA peaks, which is what we expected.
Plot residuals to see if we should restrict range
analyze.stat.panelized_residuals(glme_Offset);
Based on the distribution of residuals, we see that there is a multi-phasic histogram of residuals (which isn't good), and that a lot of our fitted values are on the range [-0.2 to 0.1] approximately.
Set the included Peak Offset cutoff points
We can look at the distribution of PeakOffset so we have an idea of what our response variable looks like in general (should have done this first).
figure('Name','Peak Offset Observed Distribution','Color','w','Units','Normalized','Position',[0.2 0.2 0.35 0.35]);
histogram(Gr.PeakOffset,100,'DisplayName','Counts');
xlabel('Peak Offset (sec)','FontName','Arial','Color','k','FontSize',14,'FontWeight','bold');
ylabel('Count','FontName','Arial','Color','k','FontSize',14,'FontWeight','bold');
title('Total Distribution of Peak Offset Times','FontName','Arial','FontSize',16,'FontWeight','bold','Color','k');
Add lines to the histogram to denote where we initialize our "cutoff" for exclusions in next section. These should move as sliders are adjusted.
h_min = line(x1,yy,'Color',[1.0 0.0 0.0],'LineWidth',2,'LineStyle',':','DisplayName','min\_peak\_offset');
h_max = line(x2,yy,'Color',[0.8 0.2 0.2],'LineWidth',2,'LineStyle',':','DisplayName','max\_peak\_offset');
legend(gca,'Location','Best');
Set min_peak_offset and max_peak_offset by dragging the sliders to the desired range.
set(h_min,'XData',ones(1,2).*min_peak_offset);
set(h_max,'XData',ones(1,2).*max_peak_offset);
Fit the model with exclusions applied
exclusions = (Gr.PeakOffset<min_peak_offset) | (Gr.PeakOffset>max_peak_offset);
glme_Offset_subset_times = fitglme(Gr,"PeakOffset~GroupID*PostOpDay*Area+(1+Error_SS+Duration|ChannelID)+(1|AnimalID)", ...
'FitMethod','ApproximateLaplace','Exclude',exclusions);
disp(glme_Offset_subset_times);
Generalized linear mixed-effects model fit by ML
Model information:
Number of observations 15108
Fixed effects coefficients 8
Random effects coefficients 850
Covariance parameters 8
Distribution Normal
Link Identity
FitMethod ApproximateLaplace
Formula:
Linear Mixed Formula with 7 predictors.
Model fit statistics:
AIC BIC LogLikelihood Deviance
10900 11022 -5434.2 10868
Fixed effects coefficients (95% CIs):
Name Estimate SE tStat DF pValue Lower
{'(Intercept)' } -0.23516 0.010794 -21.787 15100 1.2072e-103 -0.25632
{'GroupID_Intact' } -0.014185 0.024279 -0.58425 15100 0.55906 -0.061776
{'Area_CFA' } 0.056213 0.01535 3.6621 15100 0.000251 0.026125
{'PostOpDay' } 0.0017543 0.00068149 2.5742 15100 0.010057 0.00041848
{'GroupID_Intact:Area_CFA' } 0.017326 0.033789 0.51278 15100 0.60811 -0.048904
{'GroupID_Intact:PostOpDay' } 0.0019812 0.0016618 1.1922 15100 0.23319 -0.0012761
{'Area_CFA:PostOpDay' } -0.0023014 0.0009719 -2.3679 15100 0.0179 -0.0042064
{'GroupID_Intact:Area_CFA:PostOpDay'} -0.0012266 0.0023188 -0.52898 15100 0.59683 -0.0057717
Upper
-0.21401
0.033405
0.086301
0.0030901
0.083556
0.0052384
-0.00039637
0.0033186
Random effects covariance parameters:
Group: ChannelID (280 Levels)
Name1 Name2 Type Estimate
{'(Intercept)'} {'(Intercept)'} {'std' } 0.036349
{'Duration' } {'(Intercept)'} {'corr'} -0.95091
{'Error_SS' } {'(Intercept)'} {'corr'} -0.89947
{'Duration' } {'Duration' } {'std' } 0.042568
{'Error_SS' } {'Duration' } {'corr'} 0.99055
{'Error_SS' } {'Error_SS' } {'std' } 0.009932
Group: AnimalID (10 Levels)
Name1 Name2 Type Estimate
{'(Intercept)'} {'(Intercept)'} {'std'} 0.0023549
Group: Error
Name Estimate
{'sqrt(Dispersion)'} 0.34553
After applying the exclusions, our model has roughly the same parameter explanations.
disp(glme_Offset_subset_times.Rsquared);
Ordinary: 0.0059
Adjusted: 0.0054
And actually the fit is worse than before.
Look at the included data to make sure we are trying to fit a reasonable manifold
Depending upon how the data exclusion/inclusion criteria were applied, we could end up with an ill-posed regression.
analyze.stat.scatter_var(Gr(~exclusions,:),'PeakOffset_ms',...
'ShowChannels',true, ... % Plot individual channels
'ScatterType','mean',... % Look at channel means
'ShowCB',true,... % Do look at individual channel confidence bands
'YLim',[min_peak_offset max_peak_offset].*1e3);
From this plot, we should be able to tell pretty quickly if we have been too restrictive in setting the mean boundaries: the total number of samples will seem sparse in one of the panels. The optional parameter specifications in the call to analyze.stat.scatter_var alter what we are plotting a little bit: instead of individual trials, data points represent Channel Averages across all trials from a given recording session. Confidence bands show 95% bounds on Animal regression estimates (regression using all channels from a single animal to estimate expected (average) Peak-Offset on a given Post-Op Day). Unfortunately our estimates for Intact animals are all over the place, which might be why the model fits in general are pretty noisy or hard to interpret.
Re-plot Residuals for Peak-Offset fit using exclusions
analyze.stat.panelized_residuals(glme_Offset_subset_times);
It's still not very good. We can try to restrict to only sites that are Forelimb-Related or non-Forelimb-Related to see if it helps.
Restrict to only Forelimb-related
Using intracortical microstimulation (ICMS), we approximately co-registered the somatic territory neareste to a given microwire shank. Each channel therefore has a categorical tag indicating the corresponding movement representation: Distal Forelimb (DF); Proximal Forelimb (PF); Forelimb Boundary (DF-PF); Other (Neck, Vibrissae, Trunk, Shoulder, Mouth; "O"); or No Response (NR).
We can create two categories: one for forelimb-related and one for "other" and then make models grouped by those categories.
Create ICMS_Type variable and only use "Forelimb" sites
% Create new variable and use it for exclusions
forelimb_channels = (Gr.ICMS=="DF") | (Gr.ICMS=="PF") | (Gr.ICMS=="DF-PF");
Gr.ICMS_Type = categorical(forelimb_channels,[false true],["Other", "Forelimb"]);
exclusions = Gr.ICMS_Type=="Other";
Run GLME fit, including full range of Peak Offset times
glme_Offset_forelimb = fitglme(Gr,"PeakOffset~GroupID*PostOpDay*Area+(1+Error_SS+Duration|ChannelID)+(1|AnimalID)", ...
'FitMethod','ApproximateLaplace','Exclude',exclusions);
disp(glme_Offset_forelimb);
Generalized linear mixed-effects model fit by ML
Model information:
Number of observations 14943
Fixed effects coefficients 8
Random effects coefficients 673
Covariance parameters 8
Distribution Normal
Link Identity
FitMethod ApproximateLaplace
Formula:
Linear Mixed Formula with 7 predictors.
Model fit statistics:
AIC BIC LogLikelihood Deviance
20155 20276 -10061 20123
Fixed effects coefficients (95% CIs):
Name Estimate SE tStat DF pValue Lower
{'(Intercept)' } -0.13068 0.016964 -7.7034 14935 1.4076e-14 -0.16394
{'GroupID_Intact' } -0.041972 0.036525 -1.1491 14935 0.25053 -0.11357
{'Area_CFA' } 0.035889 0.021786 1.6473 14935 0.099509 -0.0068142
{'PostOpDay' } 0.0013544 0.0010478 1.2926 14935 0.19616 -0.00069939
{'GroupID_Intact:Area_CFA' } 0.072521 0.046386 1.5634 14935 0.11798 -0.018402
{'GroupID_Intact:PostOpDay' } 0.0029412 0.0024456 1.2026 14935 0.22914 -0.0018525
{'Area_CFA:PostOpDay' } -0.0012337 0.0013612 -0.90636 14935 0.36476 -0.0039018
{'GroupID_Intact:Area_CFA:PostOpDay'} -0.0027847 0.0031242 -0.89132 14935 0.37277 -0.0089086
Upper
-0.097431
0.029622
0.078592
0.0034082
0.16344
0.0077348
0.0014344
0.0033392
Random effects covariance parameters:
Group: ChannelID (221 Levels)
Name1 Name2 Type Estimate
{'(Intercept)'} {'(Intercept)'} {'std' } 0.045719
{'Duration' } {'(Intercept)'} {'corr'} -0.16379
{'Error_SS' } {'(Intercept)'} {'corr'} -0.94275
{'Duration' } {'Duration' } {'std' } 0.012834
{'Error_SS' } {'Duration' } {'corr'} 0.48341
{'Error_SS' } {'Error_SS' } {'std' } 0.03077
Group: AnimalID (10 Levels)
Name1 Name2 Type Estimate
{'(Intercept)'} {'(Intercept)'} {'std'} 1.214e-06
Group: Error
Name Estimate
{'sqrt(Dispersion)'} 0.47107
disp(glme_Offset_forelimb.Rsquared);
Ordinary: 0.0183
Adjusted: 0.0178
The model explains these data a little worse than the very first model. Interestingly, when restricted to only forelimb sites, no model fixed effect is significant at the alpha=0.05 level except for the (Intercept) term.
Visualize scatter of the data used in model
analyze.stat.scatter_var(Gr(~exclusions,:),'PeakOffset_ms','AddLabel',false,'YLab',"Peak-Offset");
Again, grey bands indicate 95% confidence bands for all trial data and each scatter point is a single trial value for a single channel. Nothing too interesting here.
Plot residuals of Forelimb-only Peak-Offset model
analyze.stat.panelized_residuals(glme_Offset_forelimb);
And again, the residuals are less than ideal.
Fit second model based on only non-forelimb sites
It may be the case that plasticity occurs via recruitment of non-forelimb sites, in which case those may actually be the more interesting channels to look at.
exclusions = Gr.ICMS_Type=="Forelimb";
glme_Offset_other = fitglme(Gr,"PeakOffset~GroupID*PostOpDay*Area+(1+Error_SS+Duration|ChannelID)+(1|AnimalID)", ...
'FitMethod','ApproximateLaplace','Exclude',exclusions);
disp(glme_Offset_other);
Generalized linear mixed-effects model fit by ML
Model information:
Number of observations 5278
Fixed effects coefficients 8
Random effects coefficients 202
Covariance parameters 8
Distribution Normal
Link Identity
FitMethod ApproximateLaplace
Formula:
Linear Mixed Formula with 7 predictors.
Model fit statistics:
AIC BIC LogLikelihood Deviance
7008 7113.1 -3488 6976
Fixed effects coefficients (95% CIs):
Name Estimate SE tStat DF pValue Lower
{'(Intercept)' } -0.16691 0.021695 -7.6933 5270 1.7004e-14 -0.20944
{'GroupID_Intact' } 0.043381 0.050366 0.86132 5270 0.3891 -0.055358
{'Area_CFA' } 0.13733 0.046851 2.9313 5270 0.0033901 0.045486
{'PostOpDay' } 0.0032838 0.0012789 2.5677 5270 0.010264 0.00077668
{'GroupID_Intact:Area_CFA' } -0.050095 0.65029 -0.077035 5270 0.9386 -1.3249
{'GroupID_Intact:PostOpDay' } -0.0038198 0.0033671 -1.1344 5270 0.25666 -0.010421
{'Area_CFA:PostOpDay' } -0.0055183 0.0026711 -2.0659 5270 0.038883 -0.010755
{'GroupID_Intact:Area_CFA:PostOpDay'} 0.022639 0.057165 0.39603 5270 0.6921 -0.089429
Upper
-0.12438
0.14212
0.22918
0.0057908
1.2247
0.0027811
-0.00028186
0.13471
Random effects covariance parameters:
Group: ChannelID (64 Levels)
Name1 Name2 Type Estimate
{'(Intercept)'} {'(Intercept)'} {'std' } 0.066764
{'Duration' } {'(Intercept)'} {'corr'} -1
{'Error_SS' } {'(Intercept)'} {'corr'} -1
{'Duration' } {'Duration' } {'std' } 0.021854
{'Error_SS' } {'Duration' } {'corr'} NaN
{'Error_SS' } {'Error_SS' } {'std' } 0.026273
Group: AnimalID (10 Levels)
Name1 Name2 Type Estimate
{'(Intercept)'} {'(Intercept)'} {'std'} 0.021478
Group: Error
Name Estimate
{'sqrt(Dispersion)'} 0.46599
The model terms generally agree with the original model.
disp(glme_Offset_other.Rsquared);
Ordinary: 0.0200
Adjusted: 0.0187
This model actually fits the data better than the original model.
Verify that the included data is appropriate
analyze.stat.scatter_var(Gr(~exclusions,:),'PeakOffset_ms','AddLabel',false,'YLab',"Peak-Offset");
The scatter plot shows us that we just don't have enough data to look at the model this way, since there are very few non-forelimb sites in the Intact CFA grouping. So it would be problematic to make too much of a conclusion from this restricted grouping.
analyze.stat.panelized_residuals(glme_Offset_other);
Fit Generalized Linear Model for Envelope Bandwidth
We use the same structure as for fitting Peak Offset, but this time we will use a log link since we are dealing with frequencies.
glme_BW = fitglme(Gr,"EnvelopeBW~GroupID*Area*PostOpDay+(1+Error_SS+Duration|ChannelID)+(1|AnimalID)",...
'link','log','FitMethod','ApproximateLaplace');
disp(glme_BW);
Generalized linear mixed-effects model fit by ML
Model information:
Number of observations 20221
Fixed effects coefficients 8
Random effects coefficients 865
Covariance parameters 8
Distribution Normal
Link Log
FitMethod ApproximateLaplace
Formula:
Linear Mixed Formula with 7 predictors.
Model fit statistics:
AIC BIC LogLikelihood Deviance
32094 32220 -16031 32062
Fixed effects coefficients (95% CIs):
Name Estimate SE tStat DF pValue Lower
{'(Intercept)' } -0.28928 0.02114 -13.684 20213 1.9453e-42 -0.33072
{'GroupID_Intact' } 0.083055 0.04403 1.8863 20213 0.059268 -0.0032484
{'Area_CFA' } 0.036227 0.025419 1.4252 20213 0.15411 -0.013596
{'PostOpDay' } 0.0071271 0.0010883 6.5488 20213 5.9399e-11 0.004994
{'GroupID_Intact:Area_CFA' } -0.066589 0.05623 -1.1842 20213 0.23634 -0.17681
{'GroupID_Intact:PostOpDay' } -0.0044754 0.0026345 -1.6987 20213 0.089382 -0.0096392
{'Area_CFA:PostOpDay' } -0.00095114 0.0015081 -0.63069 20213 0.52825 -0.0039071
{'GroupID_Intact:Area_CFA:PostOpDay'} 0.0012708 0.0036673 0.34653 20213 0.72895 -0.0059174
Upper
-0.24785
0.16936
0.086051
0.0092603
0.043627
0.00068848
0.0020048
0.008459
Random effects covariance parameters:
Group: ChannelID (285 Levels)
Name1 Name2 Type Estimate
{'(Intercept)'} {'(Intercept)'} {'std' } 0.058019
{'Duration' } {'(Intercept)'} {'corr'} 0.95341
{'Error_SS' } {'(Intercept)'} {'corr'} -0.8521
{'Duration' } {'Duration' } {'std' } 0.030709
{'Error_SS' } {'Duration' } {'corr'} -0.65451
{'Error_SS' } {'Error_SS' } {'std' } 0.041033
Group: AnimalID (10 Levels)
Name1 Name2 Type Estimate
{'(Intercept)'} {'(Intercept)'} {'std'} 0.025564
Group: Error
Name Estimate
{'sqrt(Dispersion)'} 0.52944
disp(glme_BW.Rsquared);
Ordinary: 0.0408
Adjusted: 0.0405
This model suggests there is a significant effect of Group on Envelope Bandwidth; however, when we go to look at the effect size it really is quite small relative to what we could think of making a conceivable difference in terms of a frequency parameter. It's possible that a change on the order of tenths of a Hz relates to some physiological process of interest, so we will plot these data and see if we can get a better intuition of what the parameter significance means.
analyze.stat.scatter_var(Gr,'EnvelopeBW',...
'YLab',"Envelope Bandwidth",...
'ShowChannels',true, ... % Plot individual channels
'ScatterType','mean',... % Look at channel means
'ShowCB',true,... % Do look at individual channel confidence bands
We can definitely see that there is some kind of by-day tendency across the board for a (small) increase in Envelope Bandwidth. A change from 0.8-Hz to 0.9-Hz does seem pretty small, but the difference in period between frequencies at that scale is actually substantial, so even that small change would correspond to a difference of approximately 100-ms. Therefore, it may make sense to do a transformation on this response variable and look at it in terms of the period to see if the model is better fit.
Create transformed response variable and fit response
First, create a variable called Envelope_Period for the purpose of visualizing on scatter plots, in units of milliseconds. Then, create a normalized log-transformed version of that variable, which should be better for fitting a statistical model.
Gr.Envelope_Period = 1000./Gr.EnvelopeBW;
Gr.Properties.VariableUnits{28} = 'ms';
Gr.Envelope_Period_Transformed = (log(Gr.Envelope_Period) - nanmean(log(Gr.Envelope_Period)))./nanstd(log(Gr.Envelope_Period));
Gr.Properties.VariableUnits{29} = 'z-score';
figure('Name','Difference in Distribution from Transformation');
subplot(1,2,1); histogram(Gr.Envelope_Period); title('Period (ms)'); subplot(1,2,2); histogram(Gr.Envelope_Period_Transformed); title('Transformed Distribution');
Whereas the original distribution is right-skewed, the new distribution is approximately "balanced" (although a kstest indicates that it is not statistically "normal").
Fit generalized linear model of transformed period variable
Note: since we are working with the transformed variable, we will not use the log link function.
glme_T_env = fitglme(Gr,"Envelope_Period_Transformed~GroupID*Area*PostOpDay+(1+Error_SS+Duration|ChannelID)+(1|AnimalID)",...
'FitMethod','ApproximateLaplace');
disp(glme_T_env);
Generalized linear mixed-effects model fit by ML
Model information:
Number of observations 20221
Fixed effects coefficients 8
Random effects coefficients 865
Covariance parameters 8
Distribution Normal
Link Identity
FitMethod ApproximateLaplace
Formula:
Envelope_Period_Transformed ~ 1 + GroupID*Area + GroupID*PostOpDay + Area*PostOpDay + GroupID:Area:PostOpDay + (1 + Duration + Error_SS | ChannelID) + (1 | AnimalID)
Model fit statistics:
AIC BIC LogLikelihood Deviance
57005 57132 -28487 56973
Fixed effects coefficients (95% CIs):
Name Estimate SE tStat DF pValue Lower Upper
{'(Intercept)' } 0.21343 0.035652 5.9865 20213 2.18e-09 0.14355 0.28331
{'GroupID_Intact' } -0.1247 0.072365 -1.7232 20213 0.084871 -0.26654 0.017143
{'Area_CFA' } -0.069896 0.039975 -1.7485 20213 0.080395 -0.14825 0.0084586
{'PostOpDay' } -0.011027 0.0017286 -6.3792 20213 1.8192e-10 -0.014415 -0.0076387
{'GroupID_Intact:Area_CFA' } 0.10531 0.087052 1.2098 20213 0.22638 -0.065316 0.27594
{'GroupID_Intact:PostOpDay' } 0.005889 0.0041955 1.4037 20213 0.16043 -0.0023344 0.014112
{'Area_CFA:PostOpDay' } 0.0018009 0.0024115 0.74681 20213 0.45519 -0.0029258 0.0065277
{'GroupID_Intact:Area_CFA:PostOpDay'} -0.0030283 0.0057998 -0.52214 20213 0.60158 -0.014396 0.0083398
Random effects covariance parameters:
Group: ChannelID (285 Levels)
Name1 Name2 Type Estimate
{'(Intercept)'} {'(Intercept)'} {'std' } 0.070021
{'Duration' } {'(Intercept)'} {'corr'} 0.48848
{'Error_SS' } {'(Intercept)'} {'corr'} -0.95512
{'Duration' } {'Duration' } {'std' } 0.081314
{'Error_SS' } {'Duration' } {'corr'} -0.20808
{'Error_SS' } {'Error_SS' } {'std' } 0.067877
Group: AnimalID (10 Levels)
Name1 Name2 Type Estimate
{'(Intercept)'} {'(Intercept)'} {'std'} 0.051751
Group: Error
Name Estimate
{'sqrt(Dispersion)'} 0.98085
The main fixed-effects are significant, but none of the interaction effects test as significant.
disp(glme_T_env.Rsquared);
Ordinary: 0.0282
Adjusted: 0.0279
When we plot the scatter, we will use the non-transformed version so that the data values make more sense.
analyze.stat.scatter_var(Gr,...
'YLab',"Envelope Period",...
'ShowChannels',true, ... % Plot individual channels
'ScatterType','mean',... % Look at channel means
'YLim',[0 2500]); % Even a value of 2500-ms is long on the timescale we are interested in
This trend pretty much matches what we describe above. It may relate to the Duration random effect: if the behavior in general gets faster over time, this could be due to some common-source. We can look at statistics for the Duration random effect:
[~,~,STATS] = randomEffects(glme_T_env);
K = [STATS.Name] == "Duration";
pVal = coefTest(glme_T_env,zeros(1,8),0,'REContrast',double(K'));
The coefficients for Duration trend towards significance, suggesting that this effect could relate to what we have observed.
Fit generalized linear model for Center Frequency
We use the same structure as for fitting Peak Offset, but this time we will use a log link since we are dealing with frequencies.
glme_FC = fitglme(Gr,"PeakFreq~GroupID*Area*PostOpDay+(1+Error_SS+Duration|ChannelID)+(1|AnimalID)",...
'link','log','FitMethod','ApproximateLaplace');
disp(glme_FC);
Generalized linear mixed-effects model fit by ML
Model information:
Number of observations 20221
Fixed effects coefficients 8
Random effects coefficients 865
Covariance parameters 8
Distribution Normal
Link Log
FitMethod ApproximateLaplace
Formula:
Linear Mixed Formula with 7 predictors.
Model fit statistics:
AIC BIC LogLikelihood Deviance
-3291.9 -3165.3 1662 -3323.9
Fixed effects coefficients (95% CIs):
Name Estimate SE tStat DF pValue Lower
{'(Intercept)' } -0.35086 0.0094488 -37.133 20213 4.9489e-292 -0.36939
{'GroupID_Intact' } 0.044337 0.020317 2.1823 20213 0.029097 0.0045154
{'Area_CFA' } 0.014038 0.011898 1.1799 20213 0.23806 -0.0092824
{'PostOpDay' } 0.0031184 0.00052319 5.9603 20213 2.5588e-09 0.0020929
{'GroupID_Intact:Area_CFA' } -0.023285 0.026409 -0.8817 20213 0.37795 -0.075048
{'GroupID_Intact:PostOpDay' } -0.0046548 0.001286 -3.6197 20213 0.00029567 -0.0071753
{'Area_CFA:PostOpDay' } 0.00033304 0.00072662 0.45834 20213 0.64672 -0.0010912
{'GroupID_Intact:Area_CFA:PostOpDay'} 0.0011917 0.0017664 0.67464 20213 0.49991 -0.0022706
Upper
-0.33234
0.08416
0.037358
0.0041439
0.028479
-0.0021342
0.0017573
0.004654
Random effects covariance parameters:
Group: ChannelID (285 Levels)
Name1 Name2 Type Estimate
{'(Intercept)'} {'(Intercept)'} {'std' } 0.020537
{'Duration' } {'(Intercept)'} {'corr'} 0.061836
{'Error_SS' } {'(Intercept)'} {'corr'} 0.0099773
{'Duration' } {'Duration' } {'std' } 0.027731
{'Error_SS' } {'Duration' } {'corr'} -0.99742
{'Error_SS' } {'Error_SS' } {'std' } 0.013627
Group: AnimalID (10 Levels)
Name1 Name2 Type Estimate
{'(Intercept)'} {'(Intercept)'} {'std'} 0.0097354
Group: Error
Name Estimate
{'sqrt(Dispersion)'} 0.22147
disp(glme_FC.Rsquared);
Ordinary: 0.0256
Adjusted: 0.0253
analyze.stat.scatter_var(Gr,'PeakFreq',...
'YLab','Center Frequency',...
'ShowChannels',true, ... % Plot individual channels
'ScatterType','mean',... % Look at channel means
'ShowCB',true,... % Do look at individual channel confidence bands
analyze.stat.panelized_residuals(glme_FC);